Abstract:
Genomic prediction is widely used to select candidates
for breeding. Size and composition of the reference
population are important factors influencing prediction
accuracy. In Holstein dairy cattle, large reference
populations are used, but this is difficult to achieve in
numerically small breeds and for traits that are not
routinely recorded. The prediction accuracy is usually
estimated using cross-validation, requiring the full data
set. It would be useful to have a method to predict the
benefit of multibreed reference populations that does
not require the availability of the full data set. Our
objective was to study the effect of the size and breed
composition of the reference population on the accuracy
of genomic prediction using genomic BLUP and
Bayes R. We also examined the effect of trait heritability
and validation breed on prediction accuracy. Using
these empirical results, we investigated the use of a
formula to predict the effect of the size and composition
of the reference population on the accuracy of genomic
prediction. Phenotypes were simulated in a data set
containing real genotypes of imputed sequence variants
for 22,752 dairy bulls and cows, including Holstein, Jersey,
Red Holstein, and Australian Red cattle. Different
reference populations were constructed, varying in size
and composition, to study within-breed, multibreed,
and across-breed prediction. Phenotypes were simulated
varying in heritability, number of chromosomes,
and number of quantitative trait loci. Genomic prediction
was carried out using genomic BLUP and Bayes R.
We used either the genomic relationship matrix (GRM)
to estimate the number of independent chromosomal
segments and subsequently to predict accuracy, or the
accuracies obtained from single-breed reference populations
to predict the accuracies of larger or multibreed
reference populations. Using the GRM overestimated
the accuracy; this overestimation was likely due to close
relationships among some of the reference animals.
Consequently, the GRM could not be used to predict
the accuracy of genomic prediction reliably. However,
a method based on the prediction accuracies obtained by
cross-validation with a small, single-breed reference
population slightly overestimated the accuracy for a
larger reference population of the same breed, but gave
a reasonably close estimate of the accuracy for a
multibreed reference population. This
method could be useful for making decisions regarding
the size and composition of the reference population.
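
The abstract does not state the formula it evaluates. As an illustration only, the sketch below assumes the widely used deterministic expression r = sqrt(N h² / (N h² + Me)), often attributed to Daetwyler and colleagues, where N is the reference population size, h² the heritability, and Me the number of independent chromosomal segments. It shows how an accuracy observed by cross-validation in a small reference population could be inverted to an implied Me and then used to project accuracy for a larger reference population. The function names and all numbers are hypothetical and are not taken from the study.

```python
import math


def predicted_accuracy(n_ref: int, h2: float, m_e: float) -> float:
    """Assumed formula: r = sqrt(N * h2 / (N * h2 + Me))."""
    return math.sqrt(n_ref * h2 / (n_ref * h2 + m_e))


def implied_segments(n_ref: int, h2: float, observed_r: float) -> float:
    """Invert the assumed formula for Me: Me = N * h2 * (1 - r^2) / r^2."""
    r2 = observed_r ** 2
    return n_ref * h2 * (1.0 - r2) / r2


# Hypothetical inputs: a small single-breed reference population of 1,000
# animals, heritability 0.3, and a cross-validated accuracy of 0.4.
m_e = implied_segments(n_ref=1000, h2=0.3, observed_r=0.4)

# Project the accuracy if the same breed's reference grew to 5,000 animals.
r_large = predicted_accuracy(n_ref=5000, h2=0.3, m_e=m_e)
print(f"implied Me = {m_e:.0f}, projected accuracy = {r_large:.3f}")
```

Deriving Me from an observed accuracy rather than from the GRM mirrors the abstract's finding that GRM-based estimates overestimated accuracy. Extending the projection to a multibreed reference population would need an additional assumption about how much animals from other breeds contribute, which this sketch does not attempt.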